Abstract:

The architecture of a rapid-context-switching processor called APRIL, with support for 
fine-grain threads and synchronization, is described. APRIL achieves high single-thread 
performance and supports virtual dynamic threads. A commercial 
reduced-instruction-set-computer (RISC)-based implementation of APRIL and a run-time software system 
that can switch contexts in about 10 cycles are described. Measurements taken for 
several parallel applications on an APRIL simulator show that the overhead for 
supporting parallel tasks based on futures is reduced by a factor of 2 over a 
corresponding implementation on the Encore Multimax. The scalability of a 
multiprocessor based on APRIL is explored using a performance model. The authors 
show that the SPARC-based implementation of APRIL can achieve close to 80% 
processor utilization with as few as three resident threads per processor in a large-scale 
cache-based machine with an average base network latency of 55 cycles.